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Abstract 

Characterizing macromolecular kinetics from molecular dynamics (MD) simulations requires a 
distance metric that can distinguish slowly-interconverting states. Here we build upon diffusion 
map theory and define a kinetic distance for irreducible Markov processes that quantifies how 
slowly molecular conformations interconvert. The kinetic distance can be computed given a model 
that approximates the eigenvalues and eigenvectors (reaction coordinates) of the MD Markov 
operator. Here we employ the time-lagged independent component analysis (TICA). The TICA 
components can be scaled to provide a kinetic map in which the Euclidean distance corresponds to 
the kinetic distance. As a result, the question of how many TICA dimensions should be kept in a 
dimensionality reduction approach becomes obsolete, and one parameter less needs to be specified 
in the kinetic model construction. We demonstrate the approach using TICA and Markov state 
model (MSM) analyses for illustrative models, protein conformation dynamics in bovine pancreatic 
trypsin inhibitor and protein-inhibitor association in trypsin and benzamidine. 


1 Introduction 

Characterizing the metastable (long-lived) states, their equilibrium probabilities and the transition 
rates between them are essential aims of molecular dynamics (MD) simulation. Due to increasing 
availability of high-performance computing resources for mass production of MD data, there is an 
increasing interest in systematic and automatic construction of models of the metastable dynamics, 
such as Markov state models (MSMs) [HU [HI [TUI El EH E] and diffusion maps [TTl [56]. Alternative 
methods to analyze the space explored by MD simulations include sketchmap [9], PC A [T|, kernel-PCA 
[2T1 [3]. and likelihood maximization [T9|. See [27| for a more extensive review. 

Critical component in any analysis of metastable dynamics from MD data are (1) the choice of suitable 
coordinates and (2) a distance metric on these coordinates, such that slowly-interconverting states 
are distinguished. The first question has been answered: building on conformation dynamics theory 
[29l[28l[24l[T5] it has been shown that the eigenfunctions of the backward Markov propagator underlying 
the MD are the optimal reaction coordinates: Projecting the dynamics upon these eigenfunctions 
will give rise to a maximum estimate of the timescales [23l [T5l [TT] and an optimal separation of 
metastable states. A number of approaches are available to approximate these reaction coordinates 
from MD data: Diffusion maps [26], Markov state models [24] and Markov transition models [35], TICA 
[T8l[3Q] and kernel TICA [31]. The variational approach for conformation dynamics (VAC) [T5l[T7] is a 
generalization to all aforementioned models except for diffusion maps and describes a general approach 
to combine and parametrize basis functions so as to optimally define the true eigenfunctions of the 
backward propagator. 

The second question, i.e. the choice of a distance metric has been answered in the context of diffusion 
maps. Diffusion maps model the observed data as emerging from a diffusion process [TTl [T3]. This 
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approach has been used to model molecular conformation dynamics [26] and to guide further sampling 
m The diffusion distance, introduced in m defines a distance metric that measures how slowly 
conformations interconvert. However, the definition of the diffusion distance is not limited to diffusion 
processes or even to reversible Markov processes. 

Here we define the kinetic distance for Markov processes that have a unique equilibrium distribution, 
such as MD simulation. For reversible Markov processes the kinetic distance takes the same simple 
form as the diffusion distance and can be computed from any model that provides an approximation to 
the eigenvalues and eigenfunctions of the MD backward propagator. Here we employ the time-lagged 
independent component analysis (TICA) [181EO] which approximates these eigenvalues and eigenfunc¬ 
tions in terms of a linear combinations of molecular coordinates such as atomic positions, distances or 
angles. While TICA does not provide the best possible approximation to the true eigenfunctions, it 
can be directly applied to molecular dynamics data, requires few modeling decisions to be made, and is 
readily available in the Markov modeling packages EMMA |32) (www.pyemma.org) and MSMbuilder 

0 

Using the TICA eigenvalues and eigenvectors we employ the kinetic distance as a distance metric. 
As in diffusion maps, the approximate eigenvectors can be rescaled so as to define a kinetic map in 
which the Euclidean distance is equivalent to the kinetic distance. This provides an optimal space 
to perform clustering, Markov modeling, and diffusion map, or to use the Variational Approach for 
conformation dynamics and performing other analyses of the metastable dynamics. The kinetic map 
gives a clear answer to the previously arbitrary decision on how many dimensions should be kept 
in the TICA transformation: In principle all dimensions are kept, but as a result of the scaling the 
dimensions with small eigenvalues contribute very little to the kinetic distance. In order to reduce the 
computational cost, we can choose to keep only the eigenvectors that account for a certain fraction 
of the total variation in kinetic distance, as it is commonly practiced in principal component analysis 

[nil]. 

We demonstrate the approach using TICA and MSM analyses for illustrative models, protein confor¬ 
mation dynamics in bovine pancreatic trypsin inhibitor and protein-inhibitor association in trypsin and 
benzamidine. By invoking the Variational Principle m as a means of model comparison, it is shown 
that the kinetic map generally provides better MSMs than by using unsealed TICA with manually 
selected dimension. 


2 Theory 


2.1 Diffusion distance and kinetic distance 


Suppose we have a dynamical system (here molecular dynamics) with a state space Q. O formally in¬ 
cludes positions, momenta and if needed other state variables such as the simulation box size, although 
we will later make approximations such as working with only a subset of the configuration coordinates. 
Our dynamics generates a time sequence of states x G U by means of a Markovian algorithm, i.e. some 
implementation of time-step integrator, thermostat, etc. that allows us to compute the system state 
in the subsequent time step as a function of the current state. In this framework there is a transition 
density Pr(y | x), the probability density of finding the system at state y at time r given that we have 
started it at state x at time 0. Given these preliminaries we can write the propagation of a probability 
density of states pt(x) in time as: 


Pt+riy) 


/ Pt(x)Pr(y I x)dx 
J xEQ 

V o Pt(x) 


( 1 ) 

( 2 ) 


where V is the dynamical propagator, i.e. the Markov operator describing the action of the integral 
in 0 - Einally, we require that there is a unique equilibrium distribution (usually the Boltzmann 
distribution) defined by: 

7r(x) = V o 7r(x) (3) 

The above requirements are minimal as they are fulfilled by almost every implementation of molecular 
dynamics. There is an alternative description to 0 by using the backward propagator, T, (also known 
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(4) 

(5) 


as transfer operator [29]), which propagates the weighted densities 'r’t(x) = pt{'x.)/7r{x.). Thus: 

Vt+riy) = / Pr(y I x)7r(x)t>t(x)dx 

^(y) Jy 

= T OVt{x). 


We define the kinetic distance X 2 ) as the diffusion distance introduced in m in the more 

general context of the above irreducible Markov processes. X 2 ) is a distance measure between 

any of the two states xi, X 2 G and parametrically depends on a lag time r. T)t-(xi, X 2 ) is defined as 
the distance between the system’s probability densities at time r, given that we have initialized the 
system either at state xi or at state X 2 at time 0: 


Z)2(xi, X2) 


lbT(y I Xi) -Pr{y I X2)||yi 

f IPrjy I Xi) -Pr{y I X2)P 

Jyen ^(y) 


dy- 


(6) 

(7) 


The above definition is identical to the concept of the diffusion distance except for the fact that in 
principle we allow xi, X 2 and y to be not only positions but also momenta and other state variables. 
This modification is needed in order to derive expressions of the kinetic distance that also apply to 
nonreversible dynamics. Furthermore, the original name diffusion distance comes from the fact that 
it has been derived in a context where the dynamics Pr{y \ x) come from a diffusion process. Here 
we will apply (§ to a wider class of Markovian dynamics, and therefore use the term kinetic distance 
rather than diffusion distance throughout the article. 


2.2 Spectral decomposition 

In order to derive practically useful expressions of the kinetic distance, we need to conduct a spectral 
decomposition of 0 . Given the propagator V with eigenfunctions <pi and the backward propagator 
T with eigenfunctions and we suppose that our propagator has a number of n discrete eigenpairs. 
The rest of the spectrum is bounded by the ball with radius |An+i| and it will be called We can 

write the propagation of densities p as follows: 

n 

Pt+Ay) = y]A(r)(i/)j(x) I Pt{x))(pj{y) + O pt{x.). (8) 

i=i 

The norm of all eigenvalues decay exponentially with increasing lag time. We suppose that we operate 
at a lag time r such that |An+i| ~ 0, and thus o pt{'x.) is approximately 0 everywhere. Then 

we can effectively describe the dynamics as: 

n 

Pt+r{y) = y] A(r)(t/)j(x) I Pt{x))(j)j{y) (9) 

i=i 

and for the choice pt(x) = ^(x) this becomes: 

n 

Pt+riy) = A(T)V’i(x)</>j(y). (10) 

i=i 

We sort the eigenvalues by non-increasing norm: 

Al = 1 > IA 2 I > IA 3 I > ... > |An| (11) 

The eigenfunction = 7r(x) corresponding to the eigenvalue Ai = 1 is given by the Boltzmann 

equilibrium distribution. 

Note that we have not made any restriction with respect to the reversibility of the dynamics as it 
is usually done in Markov modeling [24|. Thus the eigenvalues A(r) can either be all real-valued, or 
consist of a mix of real eigenvalues and complex conjugate pairs. For the calculation of the kinetic 
distance these cases need to be treated slightly differently. 
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2.3 Computing kinetic distances and kinetic maps 

Nonreversible dynamics Inserting (10) into yields: 


L>2(xi, X 2 ) 


X] - V’i(x 2 )) </>i(y) 

i=i 


2 


( 12 ) 


As a result of '0i(x) = 1, the first term with j = 1 disappears. Evaluating the square norm leads to 
the following double sum: 


L>^(xi, X 2 ) = EE (■*/)j(xi) - \j{4>j I 4>k)n-^>^k - 'lpki'^ 2 )) (13) 

i=2 k=2 


The above expression is applicable to any Markov process with a unique stationary distribution. Un¬ 
fortunately (13) requires both the propagator and the backward propagator eigenvectors, and there 
are only few models for nonreversible dynamics that provide both sets of eigenvectors. Markov state 
models provide both eigenvectors (as left and right eigenvectors of the transition matrix). Moreover it 
is possible to design Markov transition models [35] such that both sets of eigenvectors can be computed. 


For nonreversible dynamics, there can be complex eigenvalues and eigenvectors which come in complex 
conjugate pairs and These are handled as follows: A complex conjugate pair 

is rewritten into a real pair of eigenvalues/eigenvectors by separating their real and imaginary parts: 


4>j,^j -> Re(<(>j), Im((/)j) (15) 

Re(V’j), Im(V’j) (16) 


Note that the sign is irrelevant, so it doesn’t matter if we use (Aj, or (Aj, 0^, 7 / 1 ^) for the imaginary 

part. The transformed eigenvalue/vector set which is still the same rank but now real-valued is inserted 
into Eq. (13) to compute the diffusion distance. 


Reversible dynamics For reversible dynamics the detailed balance equations hold. In that case 
every pair of points x, y satisfies: 

TT{x)pr{y I x) = Tr{y)pr{x I y). (17) 

In this case all eigenvalues Xi{r) are real and can be associated with relaxation rates K,i and timescales 
ti as: 

Ai(r) = (18) 

Moreover, all associated eigenvectors (pi are real and related by: 

(l)i{x) = TT{x)'lpi{x). (19) 

3 have {pj I 0/c)7r-i = Sjk where 6jk is the Dirac delta. Thus, only the j = k terms 
, which simplifies to: 

n 

D^(xi, X2) = y] (Ajt/)j(xi) - . (20) 

i=2 

Which is identical to the expression for the diffusion distance m- Note, however, we can apply it to any 
dynamics which is reversible in the state space used in our model. Diffusion maps employ Smoluchowski 
or Brownian dynamics which are reversible. Langevin dynamics, which is more commonly used for 
thermostatting MD simulations fulfills a generalized detailed balance in phase space with respect to 
momentum inversion. Moreover, it has been recently shown that Langevin dynamics fulfills 0 in 
position space when integrating over momenta [5]. Since kinetic models for estimating eigenvalues 
and eigenvectors are usually estimated in position space (or a subset thereof), Langevin dynamics is 
consistent with the reversible diffusion distance Eq. ( [20| ). 


Given Eq. (19), w< 
survive in Eq. (10 
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Based on Eq. (20), we can define the weighted coordinates 

= Ai(T)V;i(x), 


(21) 


which define the kinetic map: 

§= (^1, (22) 

The kinetic map is a new set of coordinates in which data points x have been transformed such that 
their Euclidean distance corresponds to their diffusion distance: 


L>^(xi, X2) 


5'(xi) - 5 '(x2) 


2 


(23) 


When the transition density p(y|x) originates from a diffusion process, then T(x) is called a diffusion 
map |13| . 

If the number of available eigenvectors n is large, then evaluating distances in ( [2^ is computationally 
costly. However, this is often unnecessary because many eigenvalues may be small and the correspond¬ 
ing dimensions in the kinetic map contribute little to the overall distance. We can employ an approach 
commonly used in principal component analysis with geometric distances: We compute the variance 
of the diffusion distance along each coordinate: 

= Af (r) = (24) 


and the cumulative variance fraction (remember the eigenvalues are sorted by decreasing norm) by: 

(25) 


Ck = 


E^AKr) 


Er=2Af(r) 


We can then decide to truncate the distance after k = K terms where a certain cumulative variance 
threshold (e.g. 95%). The approximate diffusion distance is then given by: 


L>r(xi, X 2 ) 


K 




i=2 


E - ^^ 2 )] 


(26) 


2.4 Algorithmic approach using TIC A 

In order to apply the kinetic distance and kinetic maps, we need an algorithm that will approximate 
the eigenvalues and eigenfunctions 'ip. As described in the introduction a number of methods are 
available for this. Here we choose to use the TICA method as implemented in pyEMMA. Although 
TICA only provides only a rather rough approximation to eigenvalues and eigenfunctions, it is very 
easy to use, very robust, and has few parameters to choose. Given MD data, we chose a (usually large) 
set of input coordinates {r^(t)}, such as Cartesian coordinates (if there is a reference to orient the 
solute molecule(s) to) or internal coordinates (inter-residue distances, rotamer dihedral angles, etc.). 
We define the mean-free coordinates: 


Viit) = nit) - (27) 

and compute the covariance matrix and time-lagged covariance matrix for a given lag time r: 

Cii(O) = (28) 

Cij{T) = {yi{t)yj{t + T))t (29) 

In practice, means and covariances are computed by their empirical estimators, and the time-lagged 
covariance matrix is symmetrized. We then solve the generalized eigenvalue problem: 

C(r)ri = C(0)riE('r). (30) 
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We have the following approximations for the eigenvalues and eigenfunctions of the backward propa¬ 
gator: 


A.: < 


-ipi = 

j 


Xi 

ipi 


(31) 

(32) 


The approach above is a special case of the Variational Approach of conformation dynamics [UlIITI, 
using the choice (27) as a basis set. According to the variational principle m, the eigenvalues Xi 
will only be exactly Xi if and otherwise will be underestimated. For finite data there are 

statistical errors on top of these approximation errors, and therefore the underestimation is denoted 
as an approximation in Eq. (31). 

We then apply the following approach: 


1. Perform TIC A and compute the eigenvectors and eigenvalues A^. 

2. Define the kinetic map by scaling all coordinates as 

(33) 


'^i = Xi{T)tpi 


3. The kinetic distance is defined by 


Z)x(xi,X2) 




(34) 


Note that TICA already projects the stationary eigenvector out as a consequence of its construction 
with mean-free coordinates (27). As a result, all approximated eigenvectors and eigenvalues are kept 
and the diffusion distance (20) runs through all terms 1, ..., n. 


3 Applications 


We demonstrate the behavior of the proposed kinetic distance using time-series of two pedagogical and 
two real molecular systems. All analyses are run with pyEMMA (www.pyemma.org). 

In all examples, we conduct the following data analysis: 

1. Transformation of the input coordinates to an approximation of eigenvalues and eigenfunctions 
(A 2 , '02), (An, using TICA. 

2. Cluster discretization of (i) the transformed space (ii) dimension-reduced versions of it (TICA 
projections) as previously practiced [IHIEQ], and (iii) the full-dimensional kinetic map (scaled 
TICA transformation) proposed here. We use the same clustering method and equal number of 
clusters for the same system to get comparable results. 

3. Compute Markov models using these different discretization and compare them using the varia¬ 
tional principle m 


In order to compare different Markov models we employ the variational principle of conformation 
dynamics m, which states that the approximated eigenvalues computed via ( [30| ) underestimate the 
true eigenvalues: 

Ai(T) < Aj(r) (35) 


and that equality is only obtained when the corresponding eigenfunction is correct (t/)j = ipi). As a 
consequence, we can make the same statement about approximated relaxation timescales: 


iiir) < ti{T). 


(36) 


Straightforward fitness functions to compare Markov models obtained in different ways are then the 
partial eigensum (sum over the first m eigenvalues) or the partial timescales sum. Here we use the 
mean relaxation timescale: 


MRT{m) = 


1 

m 


m 




^ m 

m ^ 




(37) 
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As we are doing only one TIC A projection per molecular system we do not use the TIC A timescales 
but rather the final MSM timescales in (37). This will compare the quality of the different projections 
/ metrics (full TICA, truncated TICA or scaled TICA / kinetic map) with respect to their ability to 
resolve the metastable dynamics using a fixed clustering approach. 

Two-state Hidden Markov Models 


We start by illustrating two pedagogical examples that are realized by Hidden Markov models (HMMs) 
with Gaussian output. In the first example, our approach works especially well and in the second 
example it fails as the result of a poor TICA approximation. 


In order to have a known reference for the timescales we fix a transition matrix between two metastable 
states: 


0.99 0.01 
0.01 0.99 


which leads to a single relaxation timescale of t 2 = 49.5 steps. In order to generate coordinates we 
let each of the two states sample from two-dimensional Gaussian distributions using means and 
covariance matrices given by: 


/^i = 


= 


-1 

1 

1 

-1 



2 ^ 

2 ^ 


In order to analyze systematic rather than statistical behavior we simulated this model for an extensive 
250,000 steps starting from state 1. In every simulation step we take a step using the transition matrix 
P (on average transitioning to the other state every 100 steps), and generate a point from Gaussian 
distribution 1 or 2, depending which state we are in. Fig. shows a scatterplot for the resulting 
simulation. Additional details on how to construct a HMM can be found in |25) . 

In this example, TICA works especially well. Fig. shows that while the main geometric variance 
is along the ^-direction (PCA would find that as principal component), the slow process is along the 
x-direction. Indeed the first independent component (IC) points exactly along the x-direction, while 
the second IC points towards a mixtures of x and y directions (note that the ICs are not orthogonal 
in Cartesian space but rather in the space weighted by the equilibrium distribution). The arrows are 
drawn proportional to the TICA eigenvalues which are Ai ^0.75 and A 2 ~ 0. 

We compare two metrics: Euclidean distance in the two-dimensional TICA space ^ = ('0i,'02)^ and 
the kinetic map 4/ = (A2'0i, A2'02)^- We conduct regular space clustering [24] using 27 clusters in 
both cases. Fig. shows the mean relaxation timescales as a function of the lag time for the two 
cases. It is seen that Euclidean distance in the kinetic map performs much better than the unsealed 
TICA space. The reason for this behavior is that the second eigenvalue is about zero, and thus the 
kinetic map effectively ignores '02- Hence, in the kinetic map version, the 27 clusters perform a fine 
discretization of the slow reaction coordinate, which has been accurately found by TICA. In contrast, 
in the unsealed TICA version, the 27 clusters are scattered over a two-dimensional space, leading to 
a much poorer resolution of the reaction coordinate x. Thus, the kinetic map is better because TICA 
has already identified the right coordinates - all we needed to do was to adjust the metric. 


Let us look at a counterexample, where the current approach fails. Although the theory of kinetic 
distance and kinetic map is correct, the crucial point is that we have to be able to generate a sufficiently 
good approximation of eigenfunctions and eigenvalues (A^, t/^^) in order to apply it. TICA does that by 
finding a linear combination of input coordinates and will fail when the true eigenfunctions are still 
highly nonlinear in these coordinates. So let’s design a pathological example. We use the four-state 
transition matrix: 


0.9 0.1 

0.1 0.89 0.01 

0.01 0.89 0.1 

0.1 0.9 


and define Gaussian distributions that output into two-dimensional Gartesian space: 


/^i = 




1 
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The simulation is run again for 250,000 steps and the coordinates from the TICA transformation or 
the kinetic map are discretized with 25 clusters using /c-means. 

Fig. ^ shows the distribution of simulated points. The slow transition occurs between the two 
interlaced ’T’ motives and requires a zigzag path. Thus, the reaction coordinate is highly nonlinear 
in the given input coordinates and TICA cannot find a linear combination that approximates the true 
reaction coordinate 7/^2 well. Consequently, both TICA coordinates are needed in order to resolve the 
reaction coordinate. Unfortunately, TICA projects the stationary process ^2 out by construction and 
the corresponding approximated eigenvalue A 2 becomes approximately zero. Since there are only two 
eigenvalues in this example, the second coordinate is lost. As a result, the kinetic map is poor and 
gives rise to a much poorer MSM than the discretization of the unsealed TICA coordinates (Fig. [^). 
Note that this example is extremely pathological and is only supposed to show that there are rare 
combinations of poor choices in which the current approach can break down. The kinetic map has 
one dimension less than the original input space, as the first eigenfunction is constant (in TICA this 
coordinate is projected out as a result of removing the mean in the basis functions ([^). Since in our 
example TICA needed both input coordinates for a good approximation of the reaction coordinate, 
a poor kinetic map was obtained. When a third input coordinate is added, this problem disappears. 
Molecular problems are usually high-dimensional and are thus not expected to cause the observed 
problem. However, a general lesson from this example is that kinetic map will only be a good approx¬ 
imation of the real kinetic map and thus lead to near-optimal distance metric, if we 

have a good approximation of the eigenvalues and eigenfunctions (Ai,'0i). Although TICA and more 
general the method of linear variation nzmi] only find linear combinations of input coordinates we 
can turn these methods into excellent approximators of nonlinear eigenfunctions by providing suitable 
input coordinates. In MD simulations, coordinates such as interaction distances, contacts or torsion 
angles are expected to play a role in the optimal reaction coordinates. Fortunately we do not need to 
make a restrictive choice of coordinates, but can simply add all promising coordinates to the input set 
and then run TICA or the method of linear variation in order to find good combinations. 

Bovine pancreatic trypsin inhibitor (BPTI) Let us turn to protein simulations. We analyzed a 
one-millisecond simulation of BPTI produced by D. E. Shaw research using the Anton supercomputer 
[ 33 ] (see there for simulation setup). The trajectory was subsampled every 10 ns, providing 100,000 
frames that are sufficient for our analysis. We then used the 174 coordinates of the 58 C^-atoms after 
aligning them to their means as an input data set. We consider four metrics: (i) Euclidean metric in 
the full TICA space, projection onto the first two (ii) and the first six (iii) IC’s, where gaps are found 
in the TICA timescales, and (iv) the kinetic map of all scaled TICA-coordinates. /c-means clustering 
with 100 clusters was used in all cases. 

Eig. [^,b) show that the results using two and six IC’s are similar, but the MSMs built on all TICA 
coordinates are much worse. The space is so high-dimensional that the clusters cannot efficiently 
discretize the slow reaction coordinates. In contrast, the kinetic map results are significantly better 
than all other setups, in particular in terms of finding a converged slowest relaxation timescale. 

In order to get an idea of the effective dimensionality of the kinetic map, Eig. H shows the cumulative 
kinetic variance as a function of the kinetic map dimension. 95% of the variance are obtained after 
only 13 dimensions, indicating that the data analysis pipeline can work with relatively low-dimensional 
data after the TICA step. 

Fig. P shows the first two dimensions of the BPTI kinetic map with correctly scaled coordinates. 
Three metastable states are apparent in this projection whose structures are depicted in Eig. |3^.1-3. 
The slowest conformational transition between the pair (1,2) and state (3) (about 60 /rs) involves 
an outward motion of the loop around residue 10 (top right in the structure). The second-slowest 
transition (about 20 /is) involves minor concerted motions in the loop region and an exchange between 
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an ordered set of structures (1) and a less ordered set of structures (2). Qualitatively, this analysis 
agrees with previous analyses of that system 133 ESI, but the relaxation timescale found here are larger 
than previously estimated. Following the variational principle this means a better model was found 
here. 


Trypsin-Benzamidine Finally, we analyzed protein-ligand association in Benzamidine and Trypsin 
using 491 trajectories of length 100 ns each that have been generated on GPUgrid [7) (see there for 
simulation details). The trajectories were saved every 100 ps, providing 491,000 frames for the analysis. 
As coordinates we chose the distance between the atom of benzamidine with the 0^ atoms of 
Trypsin, providing 223 distances. We consider four metrics: (i) Euclidean metric in the full TICA 
space, projection onto the first two (ii) and the first ten (iii) IC’s, and (iv) the kinetic map of all scaled 
TICA-coordinates. /c-means clustering using 100 clusters was used in all cases. 

Fig. ,b) show that the MSMs using two dimensions converge nicely to timescales of about 500 ns 
and 50 ns. For larger number of ICs (10 and all) the results become worse. On one hand 100 clusters 
are no longer sufficient to discretize these higher-dimensional spaces, and on the other hand it seems 
that a different second-slowest process is found when looking at higher numbers of dimensions. The 
kinetic map results show significantly larger timescales than any of the other metrics. These timescales 
do not converge within the range of lagtimes shown, but it is known from [20] that the Benzamidine 
coordinates relative to trypsin are actually not sufficient to characterize the slowest processes in the 
system which are comprised of trypsin conformational switches. 

Fig.H shows the cumulative kinetic variance as a function of the kinetic map dimension. 95% of the 
variance are obtained after 52 out of 233 dimensions, indicating that the data can be reduced by a 
factor of about 4-5 with little losses. 

Fig. P shows the first two dimensions of the trypsin-benzamidine kinetic map with correctly scaled 
coordinates. Three metastable states are apparent in this projection whose structures are depicted in 
Fig. i^.l -3. According to the MSM on the input coordinates used here, the slowest conformational 
transition is the binding unbinding transition, although we know that slower transitions exist in the 
protein conformation [20|. The second-slowest process involves exchange with a binding intermediate 
where the ligand interacts with trypsin residues close to the binding site. Qualitatively, this analysis 
agrees with previous analyses of that system that used the trypsin coordinates [7|, but the kinetic map 
shows larger relaxation timescales. Following the variational principle this means a better model was 
found here. 


4 Discussion 

The kinetic distance defined here is the optimal distance metric for analyzing metastable molecular 
dynamics. In practice, the kinetic distance can only be computed approximately and requires a method 
that approximates the eigenvalues and eigenfunctions of the Markov backward propagator underlying 
the MD simulation. Here we suggest to use the time-lagged independent component analysis (TICA) 
as a fast and convenient method to transform a large set of Cartesian or internal coordinates of the 
molecule into such an approximation. Subsequently, the kinetic distance can be computed. The TICA 
coordinates can be transformed into a kinetic map by weighting them with their eigenvalues. In this 
kinetic map, kinetic distances and Euclidean distances are approximately equivalent, which makes it an 
excellent space for visualization and further analyses such as clustering, Markov modeling, or diffusion 
map. 

We have shown that as long as pathological cases are avoided, the kinetic distance provides a distance 
metric that builds better Markov models, because the TICA components are weighted in such a way 
that the clustering algorithm can optimally concentrate on the slow coordinates. Instead of projecting 
onto an arbitrary number of TICA dimensions as in previous work, or trying to select optimal values 
using machine learning techniques, the present theoretical insights lead to a unique and indisputable 
choice of using all coordinates in a weighted form. To reduce computational effort, a controlled 
truncation of the TICA space can be made by defining the percentage of the cumulative variation in 
kinetic distance (e.g. 95%). 
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The method can be improved by using methods that provide better approximations of the true eigen¬ 
values and eigenfunctions. Much can be done here by choosing a more suitable set of input coordinates 
for the TICA calculation, but a host of other related methods such as the Variational Approach for 
conformation dynamics, Diffusion maps, and Markov transition models are available that may further 
improve this estimate. 

The approach described herein is implemented in the development version of pyEMMA (available at 
github.com/markovmodel/pyemma), and will be included in the next official release (see www.pyemma.org 
for download instructions, documentation and examples). We have made kinetic maps the default set¬ 
ting when computing a TICA transform. 
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a) 



Figure 1: Comparison of Markov models using scaled and unsealed TIC A. (a) The data was generated 
by a two-state Hidden Markov model with Gaussian output distributions, (b) Implied relaxation 
timescales using scaled and unsealed TICA followed by regular-space clustering with 28 clusters each. 
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Figure 2: Comparison of Markov models using scaled and unsealed TIC A. (a) The data was generated 
by a two-state Hidden Markov model with Gaussian output distributions, (b) Implied relaxation 
timescales using scaled and unsealed TICA followed by regular-space clustering with 28 clusters each. 
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Figure 3: Comparison of Markov models of BPTI (1 millisecond Anton trajectory [33]) using different 
TICA projections and the kinetic map. TICA using the Ca coordinates of oriented BPTI configu¬ 
rations. MSMs built based on projections onto 2, 6 and all dimensions and the kinetic map were 
compared, (a) The two slowest implied relaxation timescales, (b) Mean of the two slowest relaxation 
timescales, (c) Cumulative variance of the diffusion distance. 95% is reached by using 13 eigenvectors, 
(d) Kinetic map (first two dimensions), (e.1-3) Structures of the metastable states indicated in (d). 
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Figure 4: Comparison of Markov models of trypsin-Benzamidine dynamics (491 GPUgrid trajectories 
of 100 ns each 0) using different TICA projections and scaled TICA. TICA using the distances from 
Benzamidin to all Trypsin-Ca coordinates had 135 usable eigenvalues. Projections onto 2, 5, 10 and 
all 135 dimensions (unsealed) were compared to scaled TICA. (a) Implied relaxation timescales, (b) 
Mean relaxation timescale, (c) Cumulative variance of the diffusion distance. 95% is reached by using 
50 eigenvectors, (d) Kinetic map (first two dimensions), (e) MD configurations sampled from the three 
metastable states visible in d (1: bound, 2: dissociated, 3: pre-bound). 
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